8.4.1 Introduction

Often, we need to test whether a hypothesis is true or false. For example, a pharmaceutical company 
might be interested in knowing if a new drug is effective in treating a disease. Here, there are two 
hypotheses. The first one is that the drug is not effective, while the second hypothesis is that the drug 
is effective. We call these hypotheses $H_0$ and $H_1$, respectively. As another example, consider a 
radar system that uses radio waves to detect aircraft. The system receives a signal and, based on the 
received signal, it needs to decide whether an aircraft is present or not. Here, there are again two 
opposing hypotheses: 


$H_0$: No aircraft is present. 

$H_1$: An aircraft is present. 

The hypothesis $H_0$ is called the null hypothesis and the hypothesis $H_1$ is called the alternative 
hypothesis. The null hypothesis, $H_0$, is usually referred to as the default hypothesis, i.e., the 
hypothesis that is initially assumed to be true. The alternative hypothesis, $H_1$, is the statement 
contradictory to $H_0$. Based on the observed data, we need to decide either to accept $H_0$, or to reject 
it, in which case we say we accept $H_1$. These are problems of hypothesis testing. In this section, we 
will discuss how to approach such problems from a classical (frequentist) point of view. We will 
start with an example, and then provide a general framework to approach hypothesis testing 
problems. When looking at the example, we will introduce some terminology that is commonly used 
in hypothesis testing. Do not worry much about the terminology when reading this example as we 
will provide more precise definitions later on. 


Example 8.22 

You have a coin and you would like to check whether it is fair or not. More specifically, let $\theta$ be the 
probability of heads, $\theta = P(H)$. You have two hypotheses: 


$H_0$ (the null hypothesis): The coin is fair, i.e., $\theta = \theta_0 = \frac{1}{2}$. 

$H_1$ (the alternative hypothesis): The coin is not fair, i.e., $\theta \neq \frac{1}{2}$. 
• Solution 

o We need to design a test to either accept $H_0$ or $H_1$. To check whether the coin is fair 
or not, we perform the following experiment. We toss the coin 100 times and record the 
number of heads. Let $X$ be the number of heads that we observe, so 

$X \sim Binomial(100, \theta)$. 

Now, if $H_0$ is true, then $\theta = \theta_0 = \frac{1}{2}$, so we expect the number of heads to be close 
to 50. Thus, intuitively we can say that if we observe close to 50 heads we should 
accept $H_0$, otherwise we should reject it. More specifically, we suggest the following 
criteria: If $|X - 50|$ is less than or equal to some threshold, we accept $H_0$. On the 
other hand, if $|X - 50|$ is larger than the threshold, we reject $H_0$ and accept $H_1$. 
Let's call that threshold $t$. 


If $|X - 50| \leq t$, accept $H_0$. 

If $|X - 50| > t$, accept $H_1$. 

But how do we choose the threshold $t$? To choose $t$ properly, we need to state some 
requirements for our test. An important factor here is probability of error. One way to 
make an error is when we reject $H_0$ while in fact it is true. We call this type I error. 
More specifically, this is the event that $|X - 50| > t$ when $H_0$ is true. Thus, 

$P(\text{type I error}) = P(|X - 50| > t \mid H_0)$. 

We read this as the probability that $|X - 50| > t$ when $H_0$ is true. (Note that, here, 
$P(|X - 50| > t \mid H_0)$ is not a conditional probability, since in classical statistics 
we do not treat $H_0$ and $H_1$ as random events. Another common notation is 
$P(|X - 50| > t \text{ when } H_0 \text{ is true})$.) To be able to decide what $t$ needs to be, 
we can choose a desired value for $P(\text{type I error})$. For example, we might want to 
have a test for which 

$P(\text{type I error}) \leq \alpha = 0.05$ 

Here, $\alpha$ is called the level of significance. We can choose 

$P(|X - 50| > t \mid H_0) = \alpha = 0.05$    (8.2) 

to satisfy the desired level of significance. Since we know the distribution of $X$ under 
$H_0$, i.e., $X \mid H_0 \sim Binomial(100, \theta = \frac{1}{2})$, we should be able to choose $t$ such 
that Equation 8.2 holds. Note that by the central limit theorem (CLT), for large values of 
$n$, we can approximate a $Binomial(n, \theta)$ distribution by a normal distribution. 
More specifically, we can say that for large values of $n$, if 
$X \sim Binomial(n, \theta_0 = \frac{1}{2})$, then 

$Y = \frac{X - n\theta_0}{\sqrt{n\theta_0(1 - \theta_0)}} = \frac{X - 50}{5}$ 

is (approximately) a standard normal random variable, $N(0, 1)$. Thus, to be able to use 
the CLT, instead of looking at $X$ directly, we can look at $Y$. Note that 
\begin{align*}
P(\text{type I error}) &= P(|X - 50| > t \mid H_0) \\
&= P\left( \left| \frac{X - 50}{5} \right| > \frac{t}{5} \;\Big|\; H_0 \right) \\
&= P\left( |Y| > \frac{t}{5} \;\Big|\; H_0 \right).
\end{align*}
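To see how well the normal approximation tracks the exact binomial probability in this example, we can compute both sides for a few thresholds. The following is only a sketch in Python using the standard library; the function names `exact_type1_error` and `approx_type1_error` are our own and are not part of the original text.

```python
from math import comb
from statistics import NormalDist

N = 100  # number of tosses in Example 8.22


def exact_type1_error(t):
    """Exact P(|X - 50| > t) when X ~ Binomial(100, 1/2).

    Under H0 each outcome count k has probability C(100, k) / 2**100.
    """
    favorable = sum(comb(N, k) for k in range(N + 1) if abs(k - 50) > t)
    return favorable / 2 ** N


def approx_type1_error(t):
    """CLT approximation: P(|Y| > t/5) = 2 - 2*Phi(t/5), with Y = (X - 50)/5."""
    return 2 - 2 * NormalDist().cdf(t / 5)


# Compare the two for a few illustrative thresholds (chosen arbitrarily).
for t in (5, 8, 9.8, 12):
    print(t, round(exact_type1_error(t), 4), round(approx_type1_error(t), 4))
```

The two columns should be close but not identical, since $X$ is discrete and the CLT step is only an approximation (no continuity correction is applied here).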


For simplicity, let's put $c = \frac{t}{5}$, so we can summarize our test as follows: 


If $|Y| \leq c$, accept $H_0$. 

If $|Y| > c$, accept $H_1$. 

where $Y = \frac{X - 50}{5}$. Now, we need to decide what $c$ should be. We need to have 

\begin{align*}
\alpha &= P(|Y| > c) \\
&= 1 - P(-c \leq Y \leq c) \\
&\approx 2 - 2\Phi(c) \quad \big(\text{using } \Phi(x) = 1 - \Phi(-x)\big).
\end{align*}

Thus, we need to have 

$2 - 2\Phi(c) = 0.05$ 

So we obtain 

$c = \Phi^{-1}(0.975) = 1.96$ 
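The value of $c$ can also be obtained numerically rather than from a table. Here is a minimal sketch, assuming Python 3.8 or later, where `statistics.NormalDist` provides the inverse CDF of the standard normal; the helper name `critical_value` is our own.

```python
from statistics import NormalDist


def critical_value(alpha):
    """Return c with 2 - 2*Phi(c) = alpha, i.e. c = Phi^{-1}(1 - alpha/2)."""
    return NormalDist().inv_cdf(1 - alpha / 2)


c = critical_value(0.05)
t = 5 * c  # since c = t/5 in this example
print(round(c, 2), round(t, 1))  # prints 1.96 9.8
```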

Thus, we conclude the following test 


If $|Y| \leq 1.96$, accept $H_0$. 

If $|Y| > 1.96$, accept $H_1$. 

The set $A = [-1.96, 1.96]$ is called the acceptance region, because it includes the 
points that result in accepting $H_0$. The set $R = (-\infty, -1.96) \cup (1.96, \infty)$ is 
called the rejection region because it includes the points that correspond to rejecting 
$H_0$. Figure 8.9 summarizes these concepts. 




[Figure: PDF of $Y$ under $H_0$, with $A$ = acceptance region, $R = R_1 \cup R_2$ = rejection region, and $\alpha = P(\text{type I error}) = \text{area}_1 + \text{area}_2 = 0.05$.] 

Figure 8.9 - Acceptance region, rejection region, and type I error for Example 8.22 

Note that since $Y = \frac{X - 50}{5}$, we can equivalently state the test as 

If $|X - 50| \leq 9.8$, accept $H_0$. 

If $|X - 50| > 9.8$, accept $H_1$. 

Or equivalently, 


If the observed number of heads is in $\{41, 42, \cdots, 59\}$, accept $H_0$. 

If the observed number of heads is in $\{0, 1, \cdots, 40\} \cup \{60, 61, \cdots, 100\}$, 
reject $H_0$ (accept $H_1$). 

In summary, if the observed number of heads is more than 9 counts away from 50, we 
reject $H_0$. 
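The final form of the test is simple enough to write as a small function. The sketch below, in Python, applies the rule $|X - 50| \leq 9.8$ to an observed number of heads out of 100 tosses; the function name `decide` and the returned strings are our own choices for illustration.

```python
def decide(num_heads):
    """Apply the test from Example 8.22 to the number of heads in 100 tosses."""
    if not 0 <= num_heads <= 100:
        raise ValueError("num_heads must be between 0 and 100")
    # Accept H0 when |X - 50| <= 9.8, otherwise reject H0 (accept H1).
    return "accept H0" if abs(num_heads - 50) <= 9.8 else "reject H0"


# The acceptance set should be exactly {41, 42, ..., 59}.
accepted = [x for x in range(101) if decide(x) == "accept H0"]
print(accepted[0], accepted[-1])  # prints 41 59
print(decide(55))  # prints accept H0
print(decide(60))  # prints reject H0
```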


Before ending our discussion on this example, we would like to mention another point. Suppose that 
we toss the coin 100 times and observe 55 heads. Based on the above discussion we should accept 
$H_0$. However, it is often recommended to say "we failed to reject $H_0$" instead of saying "we are 
accepting $H_0$." The reason is that we have not really proved that $H_0$ is true. In fact, all we know is 
that the result of our experiment was not statistically contradictory to $H_0$. Nevertheless, we will not 
worry about this terminology in this book. 



https://www.probabilitycourse.com/chapter8/8_4_1_intro.php 

